travel mode choice


Improving Trip Mode Choice Modeling Using Ensemble Synthesizer (ENSY)

Parsi, Amirhossein, Jafari, Melina, Sabzekar, Sina, Amini, Zahra

arXiv.org Artificial Intelligence

Accurate classification of mode choice datasets is crucial for transportation planning and decision-making processes. However, conventional classification models often struggle to adequately capture the nuanced patterns of minority classes within these datasets, leading to sub-optimal accuracy. In response to this challenge, we present Ensemble Synthesizer (ENSY), a novel data model that leverages probability distributions for data augmentation and is tailored specifically to enhancing classification accuracy in mode choice datasets. In our study, ENSY demonstrates remarkable efficacy by nearly quadrupling the F1 score of minority classes and improving overall classification accuracy by nearly 3%. To assess its performance comprehensively, we compare ENSY against various augmentation techniques including Random Oversampling, SMOTE-NC, and CTGAN. Through experimentation, ENSY consistently outperforms these methods across various scenarios, underscoring its robustness and effectiveness.
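For readers unfamiliar with the baselines named in this abstract, the following is a minimal sketch of Random Oversampling only, the simplest of the compared techniques. It is not ENSY, whose internals the abstract does not describe. The function name, the toy trip records, and the feature layout are all hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Pool of original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical toy trips: (distance_km, owns_car) with the chosen mode.
rows = [(2.0, 1), (5.5, 1), (8.0, 1), (1.2, 0), (12.0, 1), (0.8, 0)]
labels = ["car", "car", "car", "walk", "car", "bike"]

bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))
```

Because this baseline only repeats existing rows, it adds no new information about minority modes, which is one common motivation for synthetic-data approaches such as those compared in the paper.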


Combining data from multiple sources for urban travel mode choice modelling

Grzenda, Maciej, Luckner, Marcin, Zawieska, Jakub, Wrona, Przemysław

arXiv.org Artificial Intelligence

Demand for sustainable mobility is particularly high in urban areas. Hence, there is a growing need to predict when people will decide to use different travel modes, with an emphasis on environmentally friendly travel modes. As travel mode choice (TMC) is influenced by multiple factors, in a growing number of cases machine learning methods are used to predict travel mode choices given respondent and journey features. Typically, travel diaries are used to provide core relevant data. However, other features, such as attributes of mode alternatives including, but not limited to, travel times and, in the case of public transport (PT), walking distances, have a major impact on whether a person decides to use a travel mode of interest. Hence, in this work, we propose an architecture of a software platform that performs data fusion, combining data documenting journeys with features calculated to summarise the transport options available for these journeys, the built environment, and environmental factors such as weather conditions that may influence travel mode decisions. Furthermore, we propose various novel features, many of which we show to be among the most important for TMC prediction. We propose how stream processing engines and other Big Data systems can be used for their calculation. The data processed by the platform is used to develop machine learning models predicting travel mode choices. To validate the platform, we propose ablation studies investigating the importance of individual feature subsets calculated by it and their impact on the TMC models built with them. In our experiments, we combine survey data, GPS traces, weather and pollution time series, transport model data, and spatial data of the built environment. The growth in the accuracy of TMC models built with the additional features is up to 18.2% compared to the use of core survey data only.


Analyzing Transport Policies in Developing Countries with ABM

Salazar-Serna, Kathleen, Cadavid, Lorena, Franco, Carlos

arXiv.org Artificial Intelligence

Deciphering travel behavior and mode choices is a critical aspect of effective urban transportation system management, particularly in developing countries where unique socio-economic and cultural conditions complicate decision-making. Agent-based simulations offer a valuable tool for modeling transportation systems, enabling a nuanced understanding and policy impact evaluation. This work aims to shed light on the effects of transport policies and analyzes travel behavior by simulating agents making mode choices for their daily commutes. Agents gather information from the environment and their social network to assess the optimal transport option based on personal satisfaction criteria. Our findings, stemming from simulating a free-fare policy for public transit in a developing-country city, reveal a significant influence on decision-making, fostering public service use while positively influencing pollution levels, accident rates, and travel speed.


A prediction and behavioural analysis of machine learning methods for modelling travel mode choice

Martín-Baos, José Ángel, López-Gómez, Julio Alberto, Rodriguez-Benitez, Luis, Hillel, Tim, García-Ródenas, Ricardo

arXiv.org Artificial Intelligence

The emergence of a variety of Machine Learning (ML) approaches for travel mode choice prediction poses an interesting question to transport modellers: which models should be used for which applications? The answer to this question goes beyond simple predictive performance, and is instead a balance of many factors, including behavioural interpretability and explainability, computational complexity, and data efficiency. There is a growing body of research which attempts to compare the predictive performance of different ML classifiers with classical random utility models. However, existing studies typically analyse only the disaggregate predictive performance, ignoring other aspects affecting model choice. Furthermore, many studies are affected by technical limitations, such as the use of inappropriate validation schemes, incorrect sampling for hierarchical data, lack of external validation, and the exclusive use of discrete metrics. We address these limitations by conducting a systematic comparison of different modelling approaches, across multiple modelling problems, in terms of the key factors likely to affect model choice (out-of-sample predictive performance, accuracy of predicted market shares, extraction of behavioural indicators, and computational efficiency). We combine several real world datasets with synthetic datasets, where the data generation function is known. The results indicate that the models with the highest disaggregate predictive performance (namely extreme gradient boosting and random forests) provide poorer estimates of behavioural indicators and aggregate mode shares, and are more expensive to estimate, than other models, including deep neural networks and Multinomial Logit (MNL). It is further observed that the MNL model performs robustly in a variety of situations, though ML techniques can improve the estimates of behavioural indices such as Willingness to Pay.
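As background for the Multinomial Logit (MNL) model and the Willingness to Pay indicator mentioned in this abstract, here is a minimal sketch of the textbook MNL choice probabilities and of the value of travel time computed as a ratio of coefficients. The coefficient values, alternative-specific constants, and trip attributes are hypothetical and are not taken from the paper.

```python
import math

def mnl_probabilities(utilities):
    """Softmax over systematic utilities: P_i = exp(V_i) / sum_j exp(V_j)."""
    m = max(utilities.values())  # subtract the max for numerical stability
    exps = {k: math.exp(v - m) for k, v in utilities.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

# Hypothetical coefficients of a linear-in-parameters utility.
BETA_TIME = -0.05  # utility per minute of travel time
BETA_COST = -0.20  # utility per currency unit of travel cost

def utility(time_min, cost, asc=0.0):
    """Systematic utility with an alternative-specific constant (asc)."""
    return asc + BETA_TIME * time_min + BETA_COST * cost

V = {
    "car": utility(20, 4.0),
    "bus": utility(35, 1.5, asc=-0.3),
    "walk": utility(60, 0.0, asc=-0.5),
}
probs = mnl_probabilities(V)

# Willingness to pay for travel-time savings: ratio of marginal utilities,
# here in currency units per minute.
value_of_time = BETA_TIME / BETA_COST

print(probs)
print(value_of_time)
```

The ratio form is what makes such behavioural indicators easy to read off a logit model; for black-box classifiers they generally have to be approximated numerically.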


Distilling Black-Box Travel Mode Choice Model for Behavioral Interpretation

Zhao, Xilei, Zhou, Zhengze, Yan, Xiang, Van Hentenryck, Pascal

arXiv.org Machine Learning

Machine learning has proved to be very successful for making predictions in travel behavior modeling. However, most machine-learning models have complex model structures and offer little or no explanation as to how they arrive at these predictions. Interpretations about travel behavior models are essential for decision makers to understand travelers' preferences and plan policy interventions accordingly. Therefore, this paper proposes to apply and extend the model distillation approach, a model-agnostic machine-learning interpretation method, to explain how a black-box travel mode choice model makes predictions for the entire population and subpopulations of interest. Model distillation aims at compressing knowledge from a complex model (teacher) into an understandable and interpretable model (student). In particular, the paper integrates model distillation with market segmentation to generate more insights by accounting for heterogeneity. Furthermore, the paper provides a comprehensive comparison of student models with the benchmark model (decision tree) and the teacher model (gradient boosting trees) to quantify the fidelity and accuracy of the students' interpretations.
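The teacher-student idea described in this abstract can be sketched in a few lines. In this toy version, an arbitrary scoring rule stands in for the black-box teacher and a one-split decision stump stands in for the interpretable student; both are assumptions made for illustration and are not the models used in the paper. Fidelity is measured here as the share of inputs on which the student reproduces the teacher's prediction.

```python
def teacher(time_diff_min, cost_diff):
    """Hypothetical black box: 1 = choose transit, 0 = choose car.
    Inputs are transit-minus-car differences in time and cost."""
    score = (-0.08 * time_diff_min - 0.5 * cost_diff
             + 0.002 * time_diff_min * cost_diff)
    return 1 if score > 0 else 0

def fit_stump(X, y):
    """Search every feature and threshold for the single split that
    best reproduces y. Returns (agreement, feature, threshold, below),
    where `below` is the class predicted when x[feature] <= threshold."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for below in (0, 1):
                preds = [below if x[j] <= t else 1 - below for x in X]
                agree = sum(p == yi for p, yi in zip(preds, y)) / len(y)
                if best is None or agree > best[0]:
                    best = (agree, j, t, below)
    return best

# Query the teacher on a grid of inputs, then fit the student to the
# teacher's outputs rather than to observed choices.
X = [(dt, dc) for dt in range(-30, 31, 5) for dc in (-3, -2, -1, 0, 1, 2, 3)]
teacher_labels = [teacher(*x) for x in X]
fidelity, feature, threshold, below = fit_stump(X, teacher_labels)

print(f"student splits on feature {feature} at {threshold}, "
      f"fidelity to teacher = {fidelity:.2f}")
```

Fitting a separate student on each subpopulation of interest, rather than on the full grid, would be one simple way to mirror the market-segmentation step the abstract describes.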


Modeling Stated Preference for Mobility-on-Demand Transit: A Comparison of Machine Learning and Logit Models

Zhao, Xilei, Yan, Xiang, Yu, Alan, Van Hentenryck, Pascal

arXiv.org Artificial Intelligence

Logit models are usually applied when studying individual travel behavior, i.e., to predict travel mode choice and to gain behavioral insights on traveler preferences. Recently, some studies have applied machine learning to model travel mode choice and reported higher out-of-sample prediction accuracy than conventional logit models (e.g., multinomial logit). However, there has not been a comprehensive comparison between logit models and machine learning that covers both prediction and behavioral analysis. This paper aims at addressing this gap by examining the key differences in model development, evaluation, and behavioral interpretation between logit and machine-learning models for travel-mode choice modeling. To complement the theoretical discussions, we also empirically evaluated the two approaches on stated-preference survey data for a new type of transit system integrating high-frequency fixed routes and micro-transit. The results show that machine-learning models can produce significantly higher predictive accuracy than logit models and are better at capturing the nonlinear relationships between trip attributes and mode-choice outcomes. On the other hand, compared to the multinomial logit model, the best-performing machine-learning model, the random forest model, produces less reasonable behavioral outputs (i.e., marginal effects and elasticities) when these are computed using a standard approach. By introducing some behavioral constraints into the computation of behavioral outputs from a random forest model, however, we obtained better results that are somewhat comparable with the multinomial logit model. We believe that there is great potential in merging ideas from machine learning and conventional statistical methods to develop refined models for travel-behavior research and suggest some possible research directions.
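To make the behavioral outputs named in this abstract concrete, the sketch below computes a marginal effect and a direct elasticity for a binary logit model, using the standard closed forms, and checks the marginal effect against a central finite difference. A binary logit is used as a simplified stand-in for the multinomial case, and the coefficient and constant are hypothetical. The finite-difference line illustrates the kind of numerical approximation one might apply to a model without closed-form derivatives; whether this matches the paper's "standard approach" is an assumption.

```python
import math

BETA_TIME = -0.06  # hypothetical utility per minute of transit travel time
ASC = 1.0          # hypothetical constant for transit relative to car

def p_transit(time_min):
    """Binary logit probability of choosing transit."""
    return 1.0 / (1.0 + math.exp(-(ASC + BETA_TIME * time_min)))

def marginal_effect(time_min):
    """dP/dx = beta * P * (1 - P)."""
    p = p_transit(time_min)
    return BETA_TIME * p * (1.0 - p)

def elasticity(time_min):
    """Direct elasticity (dP/dx) * (x / P) = beta * x * (1 - P)."""
    return BETA_TIME * time_min * (1.0 - p_transit(time_min))

t = 25.0
h = 1e-5
# Central finite difference, usable with any probability function.
numeric_me = (p_transit(t + h) - p_transit(t - h)) / (2 * h)

print(marginal_effect(t), numeric_me, elasticity(t))
```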